Abstract

We consider the problem of finding the sparsest input vector which makes a given linear system
controllable. Specifically, given a matrix $A \in \mathbb{R}^{n \times n}$, define $\mathcal{B}$ to be the set of vectors
$b \in \mathbb{R}^n$ which make the linear system $\dot{x} = Ax + bu$ controllable. We show that the problem of
finding the sparsest vector in $\mathcal{B}$ is NP-hard, and even approximating the number of nonzero entries
of the sparsest vector in $\mathcal{B}$ within a multiplicative factor of $c \log n$ is NP-hard for some positive $c$.
Moreover, this remains the case even when the matrix $A$ is symmetric. On the positive side, we show
that it is possible to find, in polynomial time, a vector $b \in \mathcal{B}$ with at most $c' \log n$ times as many
nonzero entries as the sparsest vector in $\mathcal{B}$ for some positive $c'$. This is achieved by a simple greedy
heuristic which sequentially adds nonzero entries to $b$ to maximize the rank increase of the
controllability matrix.

1 Introduction

This paper considers the problem of controlling large-scale systems where it is impractical or
impossible to affect more than a small number of variables. We are motivated by the recent emergence
of control-theoretic ideas in the analysis of biological circuits [H], biochemical reaction
networks [11], and systems biology [21], where systems of many reactants often need to be driven to
desirable states with inputs which influence only a few key variables.

While controllability aspects of linear and nonlinear systems have been amply studied since
the introduction of the concept by Kalman [8], it appears that with the exception of a few recent
papers little attention has been paid to the interaction of controllability with sparsity. There has
been much recent interest in the control of large networks which have a nearly unlimited number of
interacting parts - from the aforementioned biological systems, to the smart grid [3], [27], to traffic
systems [HIS] - leading to the emergence of a budding theory of control over networks (see the
surveys [ISpiTj and the references therein). The enormous size of these systems makes it costly to
affect them with inputs which feed into a nontrivial fraction of the states, naturally motivating the
study of sparse control strategies.

We will be considering here the continuous-time linear time-invariant system

$$\dot{x} = Ax + bu \qquad (1)$$

We will assume that the matrix $A \in \mathbb{R}^{n \times n}$ is known to us and we would like to find a vector
$b$ with the fewest nonzero entries so that the system of Eq. (1) is controllable. We will refer to this
henceforth as the minimal controllability problem. Intuitively, we would like to design an open-loop
or feedback controller for the system $\dot{x} = Ax$, which may be thought of as a large-scale system
which we can afford to influence through an input in only a few places.

Our work follows a number of recent papers on the controllability properties of networks [2], [6],
[7], [9]-[13], [16], [18]-[20], [24], [25] with similar motivations. These papers sought to find connections
between the combinatorial properties of graphs corresponding to certain structured matrices $A$ and
vectors $b$ and the controllability of Eq. (1). It is impossible to give a brief overview of this considerable
literature, but we refer the interested reader to the recent survey [6]. We note that the existence
of easily optimizable necessary and sufficient combinatorial conditions for controllability would, of
course, lead to a solution of the minimal controllability problem we have proposed here; however,
to date it is only for matrices $A$ corresponding to influence processes on simple graphs like paths
and cycles that such conditions have been found [2], [15].

We mention especially the recent papers [10], [12], which have a similar starting point to the
present paper. It is observed in these papers that a brute-force approach to finding the sparsest
vector $b$ which makes Eq. (1) controllable might involve testing controllability with all of the
$2^n - 1$ possible sparsity patterns, which is computationally prohibitive. This motivates the authors
of [10], [12] to consider instead the problem of finding the sparsest vector which only makes Eq. (1)
structurally controllable.

Our paper has two main results. The first of them rigorously confirms the intractability of the
minimal controllability problem, and shows that this problem is even intractable to approximate.
Let us adopt the notation $\mathcal{B}$ for the set of vectors in $\mathbb{R}^n$ which make Eq. (1) controllable. We then
have the following theorem.

Theorem 1.1. Approximating the number of nonzero entries of the sparsest vector in $\mathcal{B}$ within a
multiplicative factor of $c \log n$ is NP-hard for some absolute constant $c > 0$; moreover, this remains
the case even under the additional assumption that the matrix $A$ is symmetric.

Thus not only is it impossible to compute the sparsest $b$ in polynomial time unless P = NP, it is
NP-hard to even approximate the number of nonzero entries in this vector within a multiplicative
logarithmic factor. This intractability result is discouraging and arguably motivates the approach
taken by [10], [12], which sought instead alternative problem formulations which are tractable.

However, we next argue that in many cases it makes sense to stick with the minimal controllability
framework. Indeed, from the perspective of designing scalable control laws, a vector $b \in \mathcal{B}$
which is reasonably sparse will do. One would like to avoid the worst-case scenario where $b$ has
roughly as many nonzero entries as the dimension $n$, which may be enormous. Thus a natural
approach is to consider relaxing the requirement of finding the sparsest $b \in \mathcal{B}$ to merely finding a
reasonably sparse $b \in \mathcal{B}$.

Unfortunately, unless P = NP, the above theorem rules out the possibility of finding in polynomial
time a vector $b \in \mathcal{B}$ with at most $c \log n$ times as many nonzero entries as the sparsest vector in $\mathcal{B}$.
However, since $\log n$ scales quite gracefully with $n$, it may suffice to find a vector $b \in \mathcal{B}$ whose
sparsity matches this barrier. Our next theorem shows that this is possible.

Theorem 1.2. There exists an algorithm whose running time is polynomial in $n$ and the number
of bits in the entries of $A$ which, under the assumption that $\mathcal{B}$ is nonempty, returns a vector in $\mathcal{B}$
whose number of nonzero entries is at most $c' \log n$ times that of the sparsest vector in $\mathcal{B}$, for some
absolute constant $c' > 0$.

Footnote: An absolute constant is a constant which does not depend on any of the problem parameters,
i.e., in this case it does not depend on $A$ or $n$.



As we will show, the algorithm which achieves this is a simple greedy heuristic which sequentially
adds entries to $b$ to maximize the rank increase of the controllability matrix. Moreover, our
simulation results on matrices $A$ deriving from directed Erdős-Rényi random graphs show that this
algorithm usually finds a vector in $\mathcal{B}$ with only a very small number of nonzero entries.

The rest of this paper is dedicated to the proof of these two theorems. Section 2 contains the
proof of Theorem 1.1 and a few of its natural corollaries. Section 3 contains the proof of Theorem
1.2. We report on the results of a simulation in Section 4, and some conclusions are drawn in Section 5.

2 Intractability results for minimal controllability problems 

This section is dedicated to the proof of Theorem 1.1 and its consequences. We will break the proof
of this theorem into two parts for simplicity: first we will show that the minimal controllability
problem is NP-hard for a general matrix $A$, and then we will argue that the proof can be extended
to symmetric matrices.

The proof itself proceeds by reduction from the hitting set problem, defined next.

Definition 2.1. Given a collection $\mathcal{C}$ of subsets of $\{1, \dots, m\}$, the minimum hitting set problem
asks for a set of smallest cardinality that has nonempty intersection with each set in $\mathcal{C}$.
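For small instances the definition can be checked directly by exhaustive search. The sketch below is illustrative only: the function name `min_hitting_set` and its calling convention are assumptions made here, not part of the paper, and its running time is exponential in $m$.

```python
from itertools import combinations

def min_hitting_set(sets, m):
    """Return a smallest subset of {1, ..., m} that intersects every set in `sets`.

    Brute force: try all candidate subsets in order of increasing size.
    """
    universe = range(1, m + 1)
    for size in range(0, m + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            # `chosen & s` is empty (falsy) exactly when `chosen` misses the set s.
            if all(chosen & s for s in sets):
                return chosen
    return None  # only reached if some set in `sets` is empty

# Hypothetical instance: four subsets of {1, 2, 3}.
collection = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
print(min_hitting_set(collection, 3))
```

No single element meets all four sets in this instance, so the search only succeeds once it reaches candidates of size two.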

The minimum hitting set problem is NP-hard, and moreover it is NP-hard to find a set whose
cardinality is within a factor of $c \log n$ of the optimal set, for some $c > 0$ (see [1], [22] for this
hardness result for the set cover problem, easily seen to be equivalent). Moreover, it is easy to see
that we can make a few natural assumptions while preserving the NP-hardness of the hitting set
problem: we can assume that each set in $\mathcal{C}$ is nonempty and we can assume that every element
appears in at least one set. We will argue that hitting set may be encoded in minimal controllability,
so that the latter must be NP-hard as well. We next give an informal overview of our argument.

The argument is an application of the PBH test for controllability, which tells us that to
make Eq. (1) controllable, it is necessary and sufficient to choose $b$ that is orthogonal to none of the
left-eigenvectors of the matrix $A$ (a proof may be found in textbooks on linear systems, e.g., [26]).
It is easy to see that if $A$ does not have any repeated eigenvalues, this is possible if and only if
the support of $b$ intersects the support of every left-eigenvector of $A$. Defining $\mathcal{C}$ to be the
collection of supports of the left-eigenvectors, this turns into a minimum hitting set problem.

However, the resulting hitting set problem has some structure, so that we cannot conclude that
minimal controllability is NP-hard just yet; for example, clearly the sets in the collection $\mathcal{C}$ defined
in the previous paragraph are the supports of the columns of an invertible matrix, so they cannot
be arbitrarily chosen. In the case where $A$ is symmetric and its eigenvectors orthogonal, there is
more structure still. The challenge is therefore to encode arbitrary hitting set problems within this
structure.

We now proceed to the formal argument. First, however, we introduce some notation which we
will use throughout the remainder of this paper. We will say that a matrix or a vector is $k$-sparse if
it has at most $k$ nonzero entries, and we will say that the matrix $A$ is $k$-controllable if there exists
a $k$-sparse vector $b$ which makes Eq. (1) controllable. We will sometimes say that a
vector $b$ makes $A$ controllable, meaning that Eq. (1) is controllable with this particular $A$, $b$. We
will use $I_k$ to denote the $k \times k$ identity matrix, $0_{k \times l}$ to denote the $k \times l$ zero matrix, and $e_{k \times l}$ to mean
the $k \times l$ all-ones matrix. Finally, $e_i$ will denote the $i$'th basis vector.



Proof of Theorem 1.1, first part. Given a collection $\mathcal{C}$ of $p$ nonempty subsets of $\{1, \dots, m\}$ such
that every element appears in at least one set, we define the incidence matrix $C \in \mathbb{R}^{p \times m}$ where we
set $C_{ij} = 1$ if the $i$'th set contains the element $j$, and zero otherwise. We then define the matrix $V$
in $\mathbb{R}^{(m+p+1) \times (m+p+1)}$ as

$$V = \begin{pmatrix} 2 I_m & 0_{m \times p} & e_{m \times 1} \\ C & (m+1) I_p & 0_{p \times 1} \\ 0_{1 \times m} & 0_{1 \times p} & 1 \end{pmatrix}$$

By construction, $V$ is strictly diagonally dominant, and therefore invertible. We set

$$A(\mathcal{C}) = V^{-1} \, \mathrm{diag}(1, \dots, m+p+1) \, V$$

Note that $A(\mathcal{C})$ has distinct eigenvalues and its left-eigenvectors are the rows of $V$. Moreover $A$
may be constructed in polynomial time from the collection $\mathcal{C}$. Indeed, since every element appears
in at least one set, we have that it takes at least $\max(m, p)$ bits to describe $\mathcal{C}$, and inverting $V$
and performing the necessary multiplications to construct $A$ can be done in polynomial time in
$m$, $p$ [23].

We claim that $\mathcal{C}$ has a hitting set with cardinality $k$ if and only if $A(\mathcal{C})$ is $(k+1)$-controllable.

Indeed, suppose that a hitting set $S$ of cardinality $k$ exists for $\mathcal{C}$. Set $b_i = 1$ for all $i \in S$ and
$b_{m+p+1} = 1$; set all other entries of $b$ to zero. By construction, $b$ has positive inner product with
every row of $V$. The PBH test then implies that the system of Eq. (1) with this $b$ is controllable.

Conversely, suppose the system is controllable with a $(k+1)$-sparse vector $b$. Once again we
appeal to the PBH condition, which gives that $b$ is not orthogonal to any row of $V$. Since $V$ is
nonnegative, we may take the absolute value of every entry of $b$: the PBH test implies this operation
preserves the controllability of Eq. (1). Moreover, the PBH condition for $A(\mathcal{C})$ and nonnegative $b$
is that for any row $i$ of $V$, there is some index $j$ such that both $V_{ij}$ and $b_j$ are positive. We will
refer to this as the intersection property.

By considering the last row of $V$, we see that the intersection property immediately implies
that $b_{m+p+1}$ is positive. We now argue that we can find a $(k+1)$-sparse vector $b$ whose support is
contained in $\{1, \dots, m\} \cup \{m+p+1\}$ such that the system of Eq. (1) is controllable. Indeed, if
$b_i > 0$ for some $i \notin \{1, \dots, m\} \cup \{m+p+1\}$ then we modify $b$ by setting first $b_i = 0$ and then $b_l = 1$
for any index $l \in \{1, \dots, m\}$ such that $V_{il} > 0$, i.e., any $l$ belonging to the $(i-m)$'th set of $\mathcal{C}$. In so
doing, we preserve the controllability of Eq. (1) since, by construction, for $i \notin \{1, \dots, m\} \cup \{m+p+1\}$ no
row besides $i$ has $i$'th entry positive, so the intersection property still holds. Moreover, the vector
$b$ certainly remains $(k+1)$-sparse.

Proceeding in this way for every $i \notin \{1, \dots, m\} \cup \{m+p+1\}$ we finally get a $(k+1)$-sparse vector
$b$ whose support is contained in $\{1, \dots, m\} \cup \{m+p+1\}$ and which makes Eq. (1) controllable.
The intersection property then implies that for each $i = 1, \dots, p$ there is some index $j$ such
that $b_j > 0$ and the $(m+i)$'th row of $V$ is positive on the $j$'th entry. By construction, this means
that $\mathrm{supp}(b) \cap \{1, \dots, m\}$ is a hitting set for $\mathcal{C}$. Since $b$ is $(k+1)$-sparse and $b_{m+p+1} > 0$, this hitting
set has cardinality at most $k$.

This concludes the proof of the italicized claim above, which shows that the number of nonzero
entries in a solution of the minimal controllability problem for $A(\mathcal{C})$ is one more than the size of the
smallest hitting set for $\mathcal{C}$. This implies that the intractability guarantees for hitting set extend to
minimal controllability. □

This proves Theorem 1.1 without the additional assumption that $A$ is a symmetric matrix.
Before proceeding to describe the proof in the symmetric case, we illustrate the construction with
a simple example.



Example: Suppose

$$\mathcal{C} = \{\{1,2\}, \{2,3\}, \{1,3\}, \{1,2,3\}\}.$$

Here $m = 3$ and $p = 4$. The smallest hitting set is of size 2, but there is no hitting set of size 1.
The incidence matrix is

$$C = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}$$

Direct computation shows that choosing $b$ with ones on a hitting set of size 2 and in the last
coordinate makes the system controllable, and our proof above implies that no 2-sparse vector $b$
makes it controllable.
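The example can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes $V$ has the block form $[2I_m,\ 0,\ e;\ C,\ (m+1)I_p,\ 0;\ 0,\ 0,\ 1]$, uses exact rational arithmetic, and tests controllability with the standard Kalman rank condition. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def build_V(sets, m):
    """Assemble V = [[2I, 0, e], [C, (m+1)I, 0], [0, 0, 1]] (assumed block form)."""
    p = len(sets)
    n = m + p + 1
    V = [[Fraction(0)] * n for _ in range(n)]
    for i in range(m):
        V[i][i] = Fraction(2)
        V[i][n - 1] = Fraction(1)
    for i, s in enumerate(sets):
        for j in s:
            V[m + i][j - 1] = Fraction(1)   # incidence row of the i'th set
        V[m + i][m + i] = Fraction(m + 1)
    V[n - 1][n - 1] = Fraction(1)
    return V

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Gauss-Jordan inverse over the rationals (M is assumed invertible)."""
    n = len(M)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def rank(M):
    """Rank by exact Gaussian elimination."""
    M = [row[:] for row in M]
    rk = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def is_controllable(A, b):
    """Kalman rank test: the vectors b, Ab, ..., A^(n-1) b must span R^n."""
    n = len(A)
    krylov, v = [], b[:]
    for _ in range(n):
        krylov.append(v)
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return rank(krylov) == n

sets, m = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}], 3
V = build_V(sets, m)
n = len(V)
D = [[Fraction((i + 1) * (i == j)) for j in range(n)] for i in range(n)]
A = matmul(matmul(inverse(V), D), V)

b = [Fraction(0)] * n
for i in (1, 2, n):            # hitting set {1, 2} plus the last coordinate
    b[i - 1] = Fraction(1)
print(is_controllable(A, b))

# A 2-element support can only work if it meets the support of every row of V.
supports = [{j for j, x in enumerate(row) if x != 0} for row in V]
hits_all = [S for S in combinations(range(n), 2) if all(set(S) & s for s in supports)]
print(hits_all)
```

The last check relies on the PBH argument from the proof: a vector whose support misses the support of some row of $V$ is orthogonal to that left-eigenvector, whatever its nonzero values are.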

We now proceed with the proof of Theorem 1.1, which remains to be shown in the case the
matrix is symmetric. For this we will need the following lemma, which is a minor modification of
known facts about the Gram-Schmidt process.

Lemma 2.2. Given a collection of $k$ orthogonal vectors $v_1, \dots, v_k$, all in $e_1^{\perp}$ in $\mathbb{R}^n$, whose entries
are rational with bit-size $B$, it is possible to find in polynomial time in $n$, $B$ vectors $v_{k+1}, \dots, v_n$
such that: (i) $v_1, \dots, v_n$ is an orthogonal collection of vectors; (ii) the first component of each of the
vectors $v_{k+1}, \dots, v_n$ is nonzero.

Proof. Set $v_{k+1} = e_1$ and use Gram-Schmidt to find $v_{k+2}, \dots, v_n$ so that $v_1, \dots, v_n$ is an orthogonal
basis for $\mathbb{R}^n$. Now let $S$ be the set of those vectors among $v_{k+1}, \dots, v_n$ which have zero first entry;
clearly we have $S = \{k+2, k+3, \dots, n\}$ since $v_{k+1}$ has first entry of 1 and zeros in all other entries
and $v_{k+2}, \dots, v_n$ are orthogonal to $v_{k+1}$.

Now given a collection $v_1, \dots, v_n$ such that $v_{k+1}$ has first entry of 1, we describe an operation
that decreases the size of $S$. Pick any $v_l \in S$ and update as

$$v_l \leftarrow c v_l + v_{k+1}, \qquad v_{k+1} \leftarrow -v_l + v_{k+1},$$

where $c = \|v_{k+1}\|_2^2 / \|v_l\|_2^2$ and both right-hand sides use the values before the update. Observe that
the collection $v_1, \dots, v_n$ remains orthogonal after this update and that the new vectors $v_{k+1}, v_l$ both
have 1 as their first entry.

As long as there remains a vector in $S$, the above operation decreases the cardinality of $S$ by
1. It follows that we can make $S$ empty, which is what we needed to show. It is easy to see that
the procedure takes polynomial time in $n$, $B$. □
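The procedure in the proof can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, arithmetic is exact over the rationals, Gram-Schmidt is run without normalization on the standard basis vectors in the order $e_1, e_2, \dots, e_n$, and no attempt is made to track bit-sizes.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def complete_orthogonal(vs, n):
    """Extend orthogonal nonzero vectors `vs` (each with zero first entry) to an
    orthogonal basis of Q^n in which every added vector has nonzero first entry."""
    basis = [[Fraction(x) for x in v] for v in vs]
    k = len(basis)
    # Unnormalized Gram-Schmidt on e_1, e_2, ..., e_n. Because every input vector
    # is orthogonal to e_1, the first vector appended is e_1 itself.
    for idx in range(n):
        cand = [Fraction(int(i == idx)) for i in range(n)]
        for u in basis:
            coef = dot(cand, u) / dot(u, u)
            cand = [c - coef * x for c, x in zip(cand, u)]
        if any(cand):               # skip candidates already in the span
            basis.append(cand)
    # basis[k] plays the role of v_{k+1}; its first entry is 1 and stays 1.
    for l in range(k + 1, n):
        if basis[l][0] == 0:
            vl, vk1 = basis[l], basis[k]
            c = dot(vk1, vk1) / dot(vl, vl)
            basis[l] = [c * x + y for x, y in zip(vl, vk1)]   # v_l <- c v_l + v_{k+1}
            basis[k] = [-x + y for x, y in zip(vl, vk1)]      # v_{k+1} <- -v_l + v_{k+1}
    return basis

# Hypothetical input: two orthogonal vectors in R^4 with zero first entry.
vs = [[0, 1, 1, 0], [0, 1, -1, 0]]
for v in complete_orthogonal(vs, 4):
    print([str(x) for x in v])
```

Each pass through the second loop applies the update from the proof once, so the number of added vectors with zero first entry drops by one per pass.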

We now complete the proof of Theorem 1.1.

Proof of Theorem 1.1, second part. Let us introduce the notation $\mathrm{mc}(X)$ to mean the number of
nonzero entries in a solution $b$ of the minimal controllability problem with the matrix $X$. We will
next construct, in polynomial time, a symmetric matrix $\hat{A}$ and prove that it satisfies the inequality
$\mathrm{mc}(\hat{A})/\mathrm{mc}(A) \in [1/3, 2]$. Once this has been established we will then have that since approximating
$\mathrm{mc}(A)$ to within a multiplicative factor of $c \log n$ is NP-hard for some $c > 0$, the same holds for
$\mathrm{mc}(\hat{A})$.

We will construct $\hat{A}$ as follows: we will take $V$ and add $\binom{m+p+1}{2} + 1$ columns to it; then we will
add a corresponding number of rows to make the resulting matrix square. We will call the resulting
matrix $\hat{V}$; we have that $\hat{V} \in \mathbb{R}^{r \times r}$ where $r = m + p + 2 + \binom{m+p+1}{2}$.

We now describe how we fill in the extra entries of $\hat{V}$. Let us index the first $\binom{m+p+1}{2}$ additional
columns by $\{i, j\}$ for $i, j = 1, \dots, m+p+1$. If the $i$'th and $j$'th rows of $V$ have nonzero inner
product, we set one of $\hat{V}_{i,\{i,j\}}$, $\hat{V}_{j,\{i,j\}}$ to 1 and the other to the negative of the inner product
of the $i$'th and $j$'th rows of $V$; else, we set both of them to zero. All other additional entries in
the first $m+p+1$ rows of $\hat{V}$ will be set to zero. Note that by construction the first $m+p+1$ rows
of $\hat{V}$ are orthogonal.

As for the extra rows of $\hat{V}$, we will fill them in using Lemma 2.2 to be orthogonal to the
existing rows of $\hat{V}$ and have a nonzero entry in the $r$'th coordinate. By construction the rows
of $\hat{V}$ are orthogonal. Finally, $\hat{A} = \hat{V}^{-1} \mathrm{diag}(1, 2, \dots, r) \hat{V}$. Note that $\hat{A}$ is symmetric and the
left-eigenvectors of $\hat{A}$ are the rows of $\hat{V}$.

Now that we have constructed $\hat{A}$, we turn to the proof of the assertion that $\mathrm{mc}(\hat{A})/\mathrm{mc}(A) \in
[1/3, 2]$. Indeed, suppose $A$ is $k$-controllable, i.e., there exists a $k$-sparse vector $b \in \mathbb{R}^{m+p+1}$ which
makes Eq. (1) controllable. We then define a $(k+1)$-sparse vector $\hat{b} \in \mathbb{R}^r$ by setting its
first $m+p+1$ entries to the entries of $b$ and setting its $r$'th entry to a random number generated
according to any continuous probability distribution, say a standard Gaussian distribution; the rest
of the entries of $\hat{b}$ are zero. By construction, rows $1, \dots, m+p+1$ of $\hat{V}$ are not orthogonal to $\hat{b}$; and
the probability that any other row is orthogonal to $\hat{b}$ is zero. We have thus generated a $(k+1)$-sparse
vector $\hat{b}$ which makes $\hat{A}$ controllable with probability 1. We thus conclude that there exists a
$(k+1)$-sparse vector which makes $\hat{A}$ controllable, for if such a vector did not exist, the probability that
$\hat{b}$ makes $\hat{A}$ controllable would be zero, not one. Finally, noting that $(k+1)/k \le 2$, we conclude that
$\mathrm{mc}(\hat{A}) \le 2\,\mathrm{mc}(A)$.

For the other direction, suppose now that $\hat{A}$ is controllable with a $k$-sparse vector $\hat{b}$. We argue
that there exists a $3k$-sparse vector $b'$ making $\hat{A}$ controllable which has the property that $b'_j = 0$
for all $j \notin \{1, \dots, m+p+1\} \cup \{r\}$.

Indeed, we now describe how we can manipulate $\hat{b}$ to obtain $b'$. Let $S$ be the support of $\hat{b}$.
If some element $j$ not in $\{1, \dots, m+p+1\} \cup \{r\}$ is in $S$, we remove it by setting $\hat{b}_j = 0$; now
observe that at most two of the first $m+p+1$ rows of $\hat{V}$ have a nonzero entry in the $j$'th place;
we add two elements in $\{1, \dots, m+p+1\}$ to $S$, one from the support of each of those two rows
within $\{1, \dots, m+p+1\}$, by setting the corresponding entries of $\hat{b}$ to a continuous positive random
variable. Specifically, suppose that $j \in S$ for some $j \notin \{1, \dots, m+p+1\} \cup \{r\}$, and $\hat{V}_{i_1,j} \neq 0$ and
$\hat{V}_{i_2,j} \neq 0$ for $i_1, i_2 \in \{1, \dots, m+p+1\}$ are the two rows among the first $m+p+1$ with a nonzero
coordinate in the $j$'th spot; we then pick any index $k_1 \in \{1, \dots, m+p+1\}$ in the support of the
$i_1$'th row of $\hat{V}$ and any index $k_2 \in \{1, \dots, m+p+1\}$ in the support of the $i_2$'th row of $\hat{V}$ and we set
$\hat{b}_{k_1}$ and $\hat{b}_{k_2}$ to be, say, independent uniform random variables on $[1,2]$. Finally, we add $r$ to $S$ by
setting $\hat{b}_r$ to a positive continuous random variable, to make sure that the rows $m+p+2, \dots, r$
have supports which overlap with the support of $\hat{b}$.

Proceeding this way, we end up with a support $S$ which has at most three times as many elements
as it had initially - because each time we remove an index we add at most three indices - but which
has no element outside of $\{1, \dots, m+p+1\} \cup \{r\}$. Finally, we note that for any row of $\hat{V}$, the
probability that it is orthogonal to $b'$ is zero. Thus with probability one, the vector $b'$ makes $\hat{A}$
controllable. Consequently, we conclude there exists a $3k$-sparse vector that makes $\hat{A}$ controllable.

It now follows that the vector $b \in \mathbb{R}^{m+p+1}$ obtained by taking the first $m+p+1$ entries of $b'$
makes $A$ controllable. Thus $\mathrm{mc}(A) \le 3\,\mathrm{mc}(\hat{A})$. This concludes the proof. □

We now illustrate the proof with an example intended to make the construction clearer. 
Example: In the interest of keeping the dimensions of the resulting matrices from getting too 
large, let us take a simpler example than before. Let 

C = {{1},{1,2}}. 

Thus $\{1\}$ is a hitting set of size one, but $\{2\}$ is not. Here $m = 2$ and $p = 2$, so that $V$ is a
$5 \times 5$ matrix and $\hat{V}$ is a $16 \times 16$ matrix. Note that the first five rows of $\hat{V}$ are orthogonal by
construction. We can now use the procedure of Lemma 2.2 to complete this to a matrix with orthogonal
rows with positive final entry. Instead, we do a quick approximation here using the MATLAB "qr"
command. Our approximation is based on the observation that running the Gram-Schmidt process with
vectors $e_6 + e_{16}, e_7 + e_{16}, \dots, e_{15} + e_{16}, e_{15}$ produces the desired result. The matrix $\hat{V}$ was
computed by the MATLAB "qr" command to three digits of precision.

Because $\hat{V}$ is a matrix with orthogonal rows (up to a certain degree of precision - if we used the
procedure of Lemma 2.2 we would have a matrix with exactly orthogonal rows, but because we used
MATLAB's built-in operations, this matrix has orthogonal rows up to some round-off error), the
matrix $\hat{A}$ is symmetric. As we have argued above, the vector
$\hat{b} = (1\ 0\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ X)^T$, where $X$ is a continuous random variable, makes $\hat{A}$
controllable with probability one. A direct computation in fact reveals that in this case even
$\hat{b} = (1\ 0\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0)^T$ makes the system controllable, so that
$\mathrm{mc}(\hat{A})/\mathrm{mc}(A) = 1$ here. Consistent with our proof, the vector
$(0\ 1\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ X)^T$ does not make $\hat{A}$ controllable because $\{2\}$
is not a hitting set for $\mathcal{C}$.

We now proceed to state and prove several natural corollaries of Theorem 1.1. The first of these
states that the problem remains NP-hard if the vector $b$ is replaced by a matrix with an arbitrary
number of columns.

Corollary 2.3. For any integer $l \ge 1$, the problem of finding a matrix $B \in \mathbb{R}^{n \times l}$ with the smallest
number of nonzero entries such that the system

$$\dot{x} = Ax + Bu$$

is controllable is NP-hard. Moreover, even approximating the smallest possible number of nonzero
entries in such a $B$ to within a factor of $c \log n$ is NP-hard for some $c > 0$. This remains the case
even if the matrix $A$ is assumed to be symmetric.

Proof. Consider the matrix $A \in \mathbb{R}^{n \times n}$ constructed in the proof of Theorem 1.1. We claim that a
$k$-sparse vector $b \in \mathbb{R}^n$ exists that makes Eq. (1) controllable if and only if a $k$-sparse matrix
$B \in \mathbb{R}^{n \times l}$ exists with this property.

One direction is trivial: if a $k$-sparse $b$ exists, then certainly a $k$-sparse matrix $B$ exists - take
$B = [b \ \ 0]$.

For the reverse direction, suppose such a $k$-sparse matrix $B$ exists. Let $b$ be a random vector
such that $b_i$ is equal to an i.i.d. standard Gaussian if the $i$'th row of $B$ has a nonzero entry, and
zero otherwise. Clearly, any vector obtained this way is $k$-sparse. Moreover, the PBH condition
applied to $(A, B)$ implies that for every row $v_i$ of $V$, there exists at least one column of $B$ which
has a nonzero entry in the same index at which $v_i$ has a nonzero entry. It follows that for every
$v_i$, there is an index $l$ such that the $l$'th entry of $v_i$ is nonzero and $b_l$ is drawn from a standard
Gaussian. This implies that $v_i$ is orthogonal to $b$ with probability zero, and since this is true for all
$v_i$ we conclude that the random vector $b$ makes $A$ controllable with probability 1. It follows there
exists a $k$-sparse vector $b$ which makes $A$ controllable. □



A variation of the minimal controllability problem involves finding matrices $B$ and $C$ that make the
system both controllable and observable, with as few nonzero entries as possible. Unfortunately,
the next corollary proves an intractability result in this scenario as well.

Corollary 2.4. For any $l_1 \ge 1$, $l_2 \ge 1$, finding matrices $B \in \mathbb{R}^{n \times l_1}$, $C \in \mathbb{R}^{l_2 \times n}$ with the smallest
total number of nonzero entries (in both $B$ and $C$) such that the system

$$\dot{x} = Ax + Bu$$
$$y = Cx \qquad (2)$$

is both controllable and observable is NP-hard.

Proof. Consider the matrix $V$ which was constructed in the proof of Theorem 1.1. We first argue
that $V^{-1}$ has the same pattern of nonzero entries as $V$, with the exception of the final column of
$V^{-1}$, which is nonzero in every entry. We will show this by explicitly computing $V^{-1}$.
Indeed, for $i = 1, \dots, m$ we have

$$(Vx)_i = 2x_i + x_{m+p+1}$$

and moreover $(Vx)_{m+p+1} = x_{m+p+1}$, so that the $i$'th row of $V^{-1}$ is immediate: it has a $1/2$ in the
$(i, i)$'th place and a $-1/2$ in the $(i, m+p+1)$'th place, and zeros elsewhere.
Next for $i = m+1, \dots, m+p$, we have

$$(Vx)_i = (m+1)x_i + \sum_{j \in \mathcal{C}_i} x_j$$

where $\mathcal{C}_i$ is the corresponding set of the collection $\mathcal{C}$. Consequently, $V^{-1}$ has a $1/(m+1)$ in the
$(i, i)$'th spot, a $-1/(2(m+1))$ in the $(i, j)$'th place where $j \in \mathcal{C}_i$, and $|\mathcal{C}_i|/(2(m+1))$ in the
$(i, m+p+1)$'th place; every other entry of the $i$'th row is zero. Finally, the last row of $V^{-1}$ is clearly
$[0, 0, \dots, 0, 1]$.

This explicit computation shows that $V^{-1}$ has the same pattern of nonzero entries as $V$, with
the exception of the final column of $V^{-1}$, which is nonzero in every entry. Now for the system of
Eq. (2) to be observable, we need to satisfy the PBH conditions, namely that each column of $V^{-1}$
is not orthogonal to some row of $C$. By considering columns $m+1, \dots, m+p+1$, we immediately
see that $C$ needs to have at least $p+1$ nonzero entries; moreover, it is easy to see that $p+1$ nonzero
entries suffice, for example by setting the last $p+1$ entries of the first row of $C$ to 1, and all other
entries to zero.

Thus the problem of minimizing the number of nonzero entries in both $B$ and $C$ to make Eq.
(2) both observable and controllable reduces to finding the sparsest $B$ to make it controllable. We
have already shown that this problem is NP-hard. □

3 Approximating minimal controllability 

We will now consider the problem of controlling a linear system with a vector which does not have
too many nonzero entries. Our goal in this section will be to prove Theorem 1.2, showing that we
can match the lower bound of Theorem 1.1 by returning a $c' \log n$ approximate solution for the
minimal controllability problem of a matrix $A \in \mathbb{R}^{n \times n}$.

For simplicity, let us adopt the notation $|b|$ for the number of nonzero entries in the vector $b$. Our
first step will be to describe a randomized algorithm for the minimal controllability problem which
assumes the ability to set entries of the vector $b$ to Gaussian random numbers. This is not a plausible
assumption: in the real world, we can generate random bits, rather than infinite-precision random
numbers. Nevertheless, this assumption simplifies the proof somewhat and clarifies the main ideas
involved. After giving an algorithm for approximating minimal controllability with this unrealistic
assumption, we will show how to modify this algorithm to work without this assumption. The final
result is a deterministic polynomial-time algorithm for $c' \log n$-approximate controllability.

We begin with a statement of our randomized algorithm in the box below (we will refer to this
algorithm henceforth as the randomized algorithm). The algorithm has a very simple structure:
given a current vector $b$ it tries to update it by setting one of the zero entries to an independently
generated standard Gaussian. It picks the index which maximizes the increase in the rank of the
controllability matrix $C(A, b) = (b \ \ Ab \ \ A^2 b \ \cdots \ A^{n-1} b)$. When it is no longer able to increase
the rank of $C(A, b)$ by doing so, it stops.

1. Initialize $b$ to the zero vector and set $c^* = 1$.

2. While $c^* > 0$,

   (a) For $j = 1, \dots, n$,

       i. If $b_j = 0$, set $\tilde{b} = b + X(j) e_j$ where $X(j)$ is i.i.d. standard normal.
       ii. Set $c(j) = \mathrm{rank}(C(A, \tilde{b})) - \mathrm{rank}(C(A, b))$.
       end

   (b) Let $j^* \in \arg\max_j c(j)$ and let $c^* = c(j^*)$.

   (c) If $c^* > 0$, set $b \leftarrow b + X(j^*) e_{j^*}$.

   end

3. Output $b$.
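The boxed procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the algorithm exactly as analyzed: standard Gaussians are replaced by random integers so that ranks can be computed exactly over the rationals (an unlucky integer draw is possible in principle, which a continuous distribution excludes with probability 1), and the names `ctrb_rank`, `greedy_sparse_input`, `seed` and `spread` are hypothetical.

```python
import random
from fractions import Fraction

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    rk = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def ctrb_rank(A, b):
    """Rank of the controllability matrix (b, Ab, ..., A^(n-1) b)."""
    n = len(A)
    krylov, v = [], list(b)
    for _ in range(n):
        krylov.append(v)
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return rank(krylov)

def greedy_sparse_input(A, seed=0, spread=10**6):
    """Greedily fill zero entries of b, each time taking the largest rank gain."""
    rng = random.Random(seed)
    n = len(A)
    b = [Fraction(0)] * n
    current = 0
    while True:
        best_gain, best_b = 0, None
        for j in range(n):
            if b[j] != 0:
                continue
            trial = b[:]
            trial[j] = Fraction(rng.randint(1, spread))  # stand-in for a Gaussian draw
            gain = ctrb_rank(A, trial) - current
            if gain > best_gain:
                best_gain, best_b = gain, trial
        if best_b is None:          # no entry increases the rank: stop
            return b
        b, current = best_b, current + best_gain

# Hypothetical test matrix: a single 3 x 3 nilpotent Jordan block.
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
b = greedy_sparse_input(A)
print([i for i, x in enumerate(b) if x != 0], ctrb_rank(A, b))
```

If no vector makes the system controllable, the sketch still terminates and returns a vector whose controllability matrix has rank below $n$, so callers should compare `ctrb_rank(A, b)` with `len(A)`.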



Our first proposition will show that if $\mathcal{B}$ (recall, $\mathcal{B}$ is the set of vectors which make the system
of Eq. (1) controllable) is nonempty, then this algorithm returns a vector in $\mathcal{B}$ with probability 1.
Moreover, the number of nonzero entries in a vector returned by this algorithm approximates the
sparsest vector in $\mathcal{B}$ by a factor of $c' \log n$, for some $c' > 0$.

Without loss of generality, we will be assuming henceforth that each eigenspace of A is one- 
dimensional: if some eigenspace of A has dimension larger than 1 then B is empty and all the 
results we will prove in this section are vacuously true. 

Let us adopt the notation $\lambda_1, \dots, \lambda_k$ for the eigenvalues of $A$. Let $T$ be a matrix that puts $A$
into Jordan form, i.e., $T^{-1} A T = J$, where $J$ is the Jordan form of $A$. Due to the assumption that
every eigenspace of $A$ is one-dimensional, we have that $J$ is block-diagonal with exactly $k$ blocks.

Without loss of generality, let us assume that the $i$'th block is associated with the eigenvalue
$\lambda_i$. Moreover, let us denote its dimension by $d_i$, and, finally, let us introduce the notation $t(i,j)$
to mean the $(d_1 + d_2 + \cdots + d_{i-1} + j)$'th row of $T^{-1}$. Observe that for fixed $i$, the collection
$\{t(i,j) \mid j = 1, \dots, d_i\}$ comprises the rows of $T^{-1}$ associated with the $i$'th block of $J$. Naturally, the
entire collection

$$\mathcal{T} = \{t(i,j), \ i = 1, \dots, k, \ j = 1, \dots, d_i\}$$

is a basis for $\mathbb{R}^n$. For a vector $v \in \mathbb{R}^n$ we use $v[i]$ to denote the vector in $\mathbb{R}^{d_i}$ whose $k$'th entry
equals the $(d_1 + \cdots + d_{i-1} + k)$'th entry of $v$. We will say that the vector $t(i,j) \in \mathcal{T}$ is covered by $b$
if $\langle t(i,j'), b \rangle \neq 0$ for some $j' \le d_i$ such that $j' \ge j$; else, we will say that $t(i,j)$ is uncovered by $b$.

Our first lemma of this section provides a combinatorial interpretation of the rank of the
controllability matrix in terms of the number of covered $t(i,j)$. While the lemma and its proof are quite
standard, we have been unable to find a clean reference in the literature; consequently, we provide
a self-contained proof here.

Lemma 3.1. $\mathrm{rank}(C(A,b))$ equals the number of covered $t(i,j) \in \mathcal{T}$.

Proof. Let $U_d$ denote the upper-shift operator on $\mathbb{R}^d$ and let $J_i$ be the $i$'th diagonal block of $J$ so
that $J_i = \lambda_i I_{d_i} + U_{d_i}$. Since $\mathrm{rank}(C(A,b)) = \mathrm{rank}(C(T^{-1}AT, T^{-1}b))$, defining $b_i = (T^{-1}b)[i]$ we
have

$$\mathrm{rank}(C(A,b)) = \sum_{i=1}^{k} \mathrm{rank}\left( \left( b_i \ \ J_i b_i \ \ J_i^2 b_i \ \cdots \ J_i^{n-1} b_i \right) \right).$$

Since the rank is unaffected by column operations, we can rewrite this as

$$\mathrm{rank}(C(A,b)) = \sum_{i=1}^{k} \mathrm{rank}\left( \left( b_i \ \ U_{d_i} b_i \ \ U_{d_i}^2 b_i \ \cdots \ U_{d_i}^{n-1} b_i \right) \right).$$

But each term in this sum is just the largest index of a nonzero entry of the vector $b_i$. That is,
letting $z_i$ be the largest $j \in \{1, \dots, d_i\}$ such that the $j$'th entry of $b_i$ is nonzero, we get

$$\mathrm{rank}(C(A,b)) = \sum_{i=1}^{k} z_i.$$

This is exactly the number of covered elements in $\mathcal{T}$. □

Our next lemma uses the combinatorial characterization we have just derived to obtain a per- 
formance bound for the randomized algorithm we have described. 

Lemma 3.2. Suppose that $\mathcal{B}$ is nonempty and let $b'$ be a vector in $\mathcal{B}$. Then with probability 1 the
randomized algorithm outputs a vector in $\mathcal{B}$ with $O(|b'| \log n)$ nonzero entries.

Proof. Since $b' \in \mathcal{B}$, we know that every $t(i,j) \in \mathcal{T}$ is covered by $b'$. Thus, in particular, if $\mathcal{F}$ is
any subset of $\mathcal{T}$, then every $t(i,j) \in \mathcal{F}$ is covered by $b'$. From this we claim that we can draw the
following conclusion: for any $\mathcal{F} \subset \mathcal{T}$ there exists an index $l \in \{1, \dots, n\}$ such that the choice $b = e_l$
covers at least $|\mathcal{F}|/|b'|$ of the vectors in $\mathcal{F}$.

Indeed, if this were not true, then for every index $l$ the number of $t(i,j) \in \mathcal{F}$ such that some $t(i,j')$
with $j \le j' \le d_i$ has a nonzero $l$'th entry is strictly fewer than $|\mathcal{F}|/|b'|$. It would then follow
that there is at least one $t(i,j) \in \mathcal{F}$ such that no $t(i,j')$, $j \le j' \le d_i$, has a nonzero entry at any index
where $b'$ is nonzero. But this contradicts every $t(i,j)$ being covered by $b'$.

Let us consider the l'th time when the randomized algorithm enters step 2.a. Let $\mathcal{F}(l)$ be the
set of uncovered vectors in $\mathcal{T}$ at this stage. We have just shown that there exists some index
$j$ such that at least $|\mathcal{F}(l)|/|b'|$ of the vectors in $\mathcal{F}(l)$ have nonzero j'th entry. With probability 1,
all of these vectors will become covered when we put an independent standard Gaussian in the j'th
entry, and also with probability 1, no previously covered vector becomes uncovered. Thus we can
infer the following conclusions with probability 1: first

$$|\mathcal{F}(l+1)| \le |\mathcal{F}(l)| - \frac{|\mathcal{F}(l)|}{|b'|},$$

since the index picked by the algorithm covers at least as many vectors as the index $j$; and also
since $(1 - \frac{1}{x})^x \le e^{-1}$ for all $x \ge 1$, we have that $|\mathcal{F}(l)|$ shrinks by a factor of at least $e^{-1}$ every
$|b'|$ steps. After $O(|b'| \log n)$ steps, $|\mathcal{F}(l)|$ is strictly below 1, so it must equal zero. This proves the
lemma. □
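The step count at the end of the proof can be spelled out as follows. This is a sketch under the assumption that at most $n$ vectors are uncovered at the start, which holds because $\mathcal{T}$ has $n$ elements; logarithms are natural.

```latex
|\mathcal{F}(l+1)|
  \;\le\; \left(1 - \frac{1}{|b'|}\right)^{l} |\mathcal{F}(1)|
  \;\le\; n \left[\left(1 - \frac{1}{|b'|}\right)^{|b'|}\right]^{l/|b'|}
  \;\le\; n \, e^{-l/|b'|}
  \;<\; 1
  \qquad \text{whenever } l > |b'| \ln n .
```

Since $|\mathcal{F}(l+1)|$ is a nonnegative integer, it must be zero once $l$ exceeds $|b'| \ln n$, and each step adds exactly one nonzero entry to $b$.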

By choosing $b'$ to be the sparsest vector in $\mathcal{B}$ and applying the lemma we have just proved,
we have the approximation guarantee that we seek: the randomized algorithm finds a vector in $\mathcal{B}$
which is an $O(\log n)$ approximation to the sparsest vector.

We next revise our algorithm by removing the assumption that we can generate infinite-precision 
random numbers. The new algorithm is given in a box below, and we will refer to it henceforth as 
the deterministic algorithm. 

1. Initialize $b$ to the zero vector and set $c^* = 1$.

2. While $c^* > 0$,

   (a) For $j = 1, \ldots, n$,

       i. If $b_j = 0$, then for $p = 1, \ldots, 2n+1$, set $b_{j,p} = b + p e_j$.
       ii. Set $c(j,p) = \operatorname{rank}(C(A, b_{j,p})) - \operatorname{rank}(C(A,b))$.
       end

   (b) Let $(j^*, p^*) \in \arg\max_{(j,p)} c(j,p)$ and let $c^* = c(j^*, p^*)$.

   (c) If $c^* > 0$, set $b \leftarrow b + p^* e_{j^*}$.

   end

3. Output $b$.
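The boxed procedure can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `rank`, `ctrb`, and `greedy_sparse_input` are hypothetical, ranks are computed exactly over the rationals (so the input matrix is assumed to have integer or rational entries), and ties in the arg max are broken by taking the first maximizer encountered.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * p for a, p in zip(m[i], m[r])]
        r += 1
    return r

def ctrb(A, b):
    """Controllability matrix [b, Ab, ..., A^(n-1) b], returned as a list of rows."""
    cols, v = [], list(b)
    for _ in range(len(b)):
        cols.append(v)
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return [list(row) for row in zip(*cols)]

def greedy_sparse_input(A):
    """Greedy loop of the boxed algorithm: repeatedly add the entry with the best rank gain."""
    n = len(A)
    b = [0] * n                                  # step 1
    while True:                                  # step 2
        base = rank(ctrb(A, b))
        best_gain, best_j, best_p = 0, None, None
        for j in range(n):                       # step 2(a)
            if b[j] != 0:
                continue                         # only zero entries are candidates
            for p in range(1, 2 * n + 2):        # p = 1, ..., 2n + 1
                trial = b[:]
                trial[j] = p
                gain = rank(ctrb(A, trial)) - base
                if gain > best_gain:             # step 2(b), first maximizer wins
                    best_gain, best_j, best_p = gain, j, p
        if best_gain == 0:                       # c* = 0: stop
            return b                             # step 3
        b[best_j] = best_p                       # step 2(c)

A = [[1, 1], [0, 2]]
b = greedy_sparse_input(A)
print(b, rank(ctrb(A, b)))  # [0, 1] 2
```

Each outer iteration evaluates $O(n^2)$ candidate vectors, so the sketch favors clarity over speed. Note that it also terminates when no single input makes the system controllable, in which case the returned vector attains less than full rank.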



The deterministic algorithm has a simple interpretation. Rather than putting a random entry 
in the j'th place and testing the corresponding increase in the rank of the controllability matrix, 
it instead tries 2n + 1 different numbers to put in the j'th place, and chooses the one among them 
with the largest corresponding increase in rank. The main idea is that to obtain the "generic"
rank increase associated with a certain index we can either put a random number in the j'th entry
or we can simply test what happens when we try enough distinct values in the j'th entry.
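A small worked example illustrates why a single test value may not reveal the generic rank increase. The matrix and vector here are hypothetical choices made for illustration, with $n = 2$ and $j = 2$:

```latex
A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}, \quad
b = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
C(A, b + t e_2) = \begin{pmatrix} 1 & 1 + t \\ t & 2t \end{pmatrix}, \qquad
\det C(A, b + t e_2) = 2t - t(1 + t) = t(1 - t).
```

Here $\operatorname{rank}(C(A,b)) = 1$. The value $t = 1$ is a root of the determinant (the vector $(1,1)^T$ is an eigenvector of $A$), so testing only $p = 1$ would report no rank increase, whereas $p = 2$ gives rank 2 and hence the generic increase of 1.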

We are now finally ready to prove Theorem 1.2.

Proof of Theorem 1.2. We will show that when the deterministic and the randomized algorithms
enter step 2.b with identical $A, b$, we will have that $c(j)$ in the randomized algorithm equals
$\max_p c(j,p)$ in the deterministic algorithm with probability 1. This implies that with probability
1 the sets $\arg\max_j c(j)$ and $\arg\max_{(j,p)} c(j,p)$ are identical. We have already shown an $O(\log n)$
approximation to the sparsest vector in $\mathcal{B}$ for the randomized algorithm (which picks an arbitrary
index from $\arg\max_j c(j)$ at each step); it is easy to see that showing the above fact will immediately
imply the $O(\log n)$ approximation for the deterministic algorithm as well.

Now we turn to the proof of the claim in the above paragraph. Fix $A, b, j$, and consider $p_m(t)$,
which we define to be the sum of the squares of all the $m \times m$ minors of
the matrix $C(A, b + t e_j)$. Naturally, $p_m(t)$ is a nonnegative polynomial in $t$ of degree at most $2n$.
Moreover, the statement $p_m(t) > 0$ is equivalent to the assertion that $C(A, b + t e_j)$ has rank at least
$m$. Note that because $p_m(t)$ is a polynomial and therefore has finitely many roots if not identically
zero, we have that as long as $p_m(t)$ is not identically zero, $p_m(t)$ evaluated at a standard Gaussian
is positive with probability 1. Consequently, whenever the randomized algorithm enters step 2.a.ii
and computes $c(j)$, we have that with probability 1 this $c(j)$ equals the largest $m \in \{1, \ldots, n\}$ such
that $p_m(t)$ is not identically zero, minus $\operatorname{rank}(C(A,b))$.

Similarly, consider the deterministic algorithm as it enters step 2.a. Observe that as long as
$p_m(t)$ is not identically zero, plugging in one of $1, 2, \ldots, 2n+1$ into $p_m(t)$ will produce a positive
number because $p_m(t)$ has degree at most $2n$. As a consequence, $\max_p c(j,p)$ equals the largest
$m$ such that $p_m(t)$ is not identically zero, minus $\operatorname{rank}(C(A,b))$. This proves that each $c(j)$ in the
randomized algorithm equals $\max_p c(j,p)$ in the deterministic algorithm with probability 1. This
concludes the proof of the claim from the first paragraph and consequently concludes the proof. □

4 A simulation 

We briefly report on a result of a MATLAB simulation of our randomized algorithm. We used the
MATLAB "randn" command to approximately produce a Gaussian random variable. To obtain
the matrices $A$, we generated a directed Erdős-Rényi random graph where each link was present
with a probability of $2 \log n / n$. We constructed the adjacency matrices of these graphs and threw
out those which had two eigenvalues which were at most 0.01 apart. The purpose of this was to rule
out matrices which had eigenspaces that were more than one dimensional; due to the imprecise
nature of eigenvalue computations with MATLAB, testing whether two eigenvalues were exactly
equal did not ensure this. Thus the resulting matrix was controllable with some $b$.
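The graph-generation step can be sketched as follows. This is not the MATLAB code used for the simulation: the function name `directed_erdos_renyi` is hypothetical, and the sketch assumes a natural logarithm, no self-loops, and the convention that entry $(i,j)$ equals 1 when there is a link from $i$ to $j$, none of which is specified in the text. The eigenvalue-separation filter is omitted because it needs a numerical eigenvalue routine.

```python
import math
import random

def directed_erdos_renyi(n, rng):
    """Adjacency matrix of a directed Erdos-Renyi graph with link probability 2 log(n) / n.

    Assumptions (not stated in the text): natural log, no self-loops,
    and A[i][j] == 1 meaning a link from node i to node j.
    """
    p = min(1.0, 2 * math.log(n) / n)
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
A = directed_erdos_renyi(50, rng)
num_links = sum(map(sum, A))
```
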

We then ran the randomized algorithm of the previous section to find a sparse vector making
the system controllable. We note that a few minor revisions to the algorithm were necessary due to
the ill-conditioning of the controllability matrices. In particular, with $A$ being the adjacency matrix
of a directed Erdős-Rényi graph on as few as 20 vertices and $b$ a randomly generated vector, the
MATLAB command "rank(ctrb(A,b))" often returns nonsensical results due to the large condition
number of the controllability matrix. We thus computed the rank of the controllability matrix by
initially diagonalizing $A$ and counting the number of eigenvectors orthogonal to $b$.

We ran our simulation for up to $n = 100$ nodes, generating 100 random graphs for each $n$,
and we found that almost all the matrices could be controlled with a $b$ with just one nonzero entry,
and all could be controlled with a $b$ that had two nonzero entries. Specifically, out of the 10,000
random adjacency matrices generated in this way, 9,990 could be controlled with a 1-sparse $b$ and
the remaining 10 could be controlled with a 2-sparse $b$. It appears that the randomized protocol can
successfully find very sparse vectors making the system controllable in this randomized scenario.

5 Conclusions

We have shown that it is NP-hard to approximate the minimal controllability problem within a factor
of $c \log n$ for some $c > 0$, and we have provided an algorithm which approximates it to a factor of
$c' \log n$ for some $c' > 0$. Up to the difference in the constants between $c$ and $c'$, this resolves the
polynomial-time approximability of the minimal controllability problem.

The study of minimal controllability problems is relatively recent and quite a few open questions 
remain. For example, it would be interesting to understand for which matrices $A$ the minimal
controllability question can be solved in polynomial time. This relates to the combinatorial structure
of the eigenvectors of A and will likely require theorems relating the combinatorics of the nonzero 
entries of A to the combinatorics of the nonzero entries of the matrix of eigenvectors. 

It is reasonable to anticipate that real-world networks may have certain "generic" features 
which considerably simplify the minimal controllability problem. It would therefore be interesting 
to find algorithms for minimal controllability which always return the correct answers and which 
"generically" finish in polynomial time. Finally, an understanding of more nuanced features of 
controllability (i.e., how the energy required to move a network from one state to another depends 
on its structure) appears to be lacking at the moment. 